Abstract. Sprouts is a two-player topological game, invented in 1967 by 
Michael Paterson and John Conway. The game starts with p spots, lasts 
at most 3p - 1 moves, and the player who makes the last move wins. In the 
misere version of Sprouts, on the contrary, the player who makes the last move 
loses. 

Sprouts is a very intricate game, and the first computer analysis in 1991 
reached only p = 11. In 2007, new results up to p = 32 were made possible 
by combinatorial game theory: when a position is a sum of independent 
games, it is possible to replace some of these games by a natural number, called 
the nimber, without changing the winning or losing outcome of the complete 
position. 

However, this reduction does not apply to the misere version, which makes the 
analysis of Sprouts (and more generally of any game) more difficult in the 
misere version. In 1991, only p = 9 was reached in misere Sprouts, and we 
describe in this paper how we reached p = 17. 

First, we describe a theoretical tool, the reduced canonical tree, which plays 
a role similar to the nimber in the normal version. Then, we describe the way 
we have implemented it in our program, and detail the results it allowed us to 
obtain on misere Sprouts. 



1. Introduction 

Sprouts is a two-player game that needs only a sheet of paper and a pen, 
and its rules are extremely simple. Martin Gardner's article of 1967 [4] 
(the year the game was invented) is a good introduction to the study of the 
game. 

Sprouts is a combinatorial game: two players play alternately, knowing all the 
possible information to choose their next move. There is no room for chance in the 
game. Moreover, Sprouts is an impartial combinatorial game: from any position, 
the same moves are available to either player. 

In the normal version of the game, the winner is determined by the following 
rule: a player who cannot make a move loses. Draws are not possible, and since a 
game beginning with p spots is finite (with at most 3p - 1 moves), one of the 
players must have a winning strategy (provided, of course, that this player 
plays perfectly). 

Definition 1. $S_p$ denotes the normal version of the Sprouts game starting with p 
spots (and $S_p^-$ the misere version). 

Finding which player has a winning strategy is difficult because of the 
complexity of the game. The first manual analysis only managed to solve $S_6$, and it 
required considering many cases through many pages of reasoning. In 1991, the first 
Sprouts program enabled Applegate, Jacobson and Sleator [1] to extend this analysis up 
to $S_{11}$, and to formulate the following conjecture: 

Conjecture 1. The first player has a winning strategy in Sp if and only if p is 3, 
4 or 5 modulo 6. 

Later computation [6] in 2007 proved this conjecture to be true up to $S_{32}$. 

In the misere version of the game, the last player able to move is this time the 
loser. Games in misere version are almost always more difficult to analyze. For 
example, in the normal version, a disjunctive sum of two losing positions is losing. 
But in the misere version, the sum can be losing or winning, depending on the case. 
Detailed explanations about the difficulty of misere games are given in the Theory 
section. 

Because of this difficulty, Applegate et al. in 1991 only reached $S_9^-$ in the misere 
version, upon which they formulated the following conjecture: 

Conjecture 2. (false) The first player has a winning strategy in $S_p^-$ if and only if 
p is equal to 0 or 1 modulo 5. 

This conjecture was invalidated by Josh Purinton (unpublished work), who first 
computed that the winning strategy for $S_{12}^-$ actually belongs to the first player^1. In fact, 
we observe that above $S_4^-$, we get back to a pattern of period 6, just as in the 
normal version, but with a shift: the values known so far are shown in the table of 
figure 1. 
Figure 1. Computed Win/Loss state of $S_p^-$ 



We can formulate the following conjecture: 

Conjecture 3. The first player has a winning strategy in $S_p^-$ if and only if p is 
equal to 0, 4 or 5 modulo 6, except if p = 1 or 4. 

We describe in this article how we managed to analyze the game up to $S_{17}^-$, beginning 
first with some well-known notions of combinatorial game theory. Then, we explain 
how we implemented them in a program, and finally, we give details on the 
computed results. 

In this article, we use the notation of Sprouts positions described in [6], which 
is derived from the notation of [1]. 

2. Theory 

In this section, we give an overview of some well-known results of the theory of 
misere games, without proving all of them again. A good entry point for this theory 
is On Numbers And Games [3], a book by Conway (chapter 12, pp. 136-152), or 
the famous Winning Ways [2] by Berlekamp, Conway and Guy (chapter 13, pp. 
413-452). 



^1 See http : / / www. wgosa. org/ confirmation^ . htm . Josh Purinton has also solved the misere 
version up to $S_{16}^-$. 
2.1. Indistinguishability. We give first some standard definitions from the theory 
of impartial combinatorial games. A game in a given state will be called a position. 
The outcome of a position is W (win) or L (loss), depending on the existence of a 
winning strategy from this position. The sum of two positions $P_1$ and $P_2$ is the 
position obtained by grouping them, so that at each move, the player can choose 
to move in the first or the second one. This sum will be denoted $P_1 + P_2$. 

Following [7], we say that two positions $P_1$ and $P_2$ are indistinguishable, and 
we write $P_1 \sim P_2$, if for any other position $P$ the positions $P_1 + P$ and $P_2 + P$ 
have the same outcome. This notion is of practical interest because if two Sprouts 
positions are known to be indistinguishable, it is possible to replace the biggest one 
by the smallest one in order to accelerate computation. 

Indistinguishability depends on the considered game, and also on its version: two 
positions can be indistinguishable in the normal version, but not in the misere one. 
For example, in the case of Sprouts, the start positions with 1 and 2 spots, $S_1$ and 
$S_2$, are indistinguishable in the normal version (we write $S_1 \sim_+ S_2$). But in the 
misere version the empty position distinguishes them, since $S_1$ is a Win and $S_2$ a 
Loss ($S_1 \not\sim_- S_2$). 

2.2. Indistinguishability in the normal version. Indistinguishability allows us 
to simplify considerably any impartial game in the normal version (when the player 
who cannot move is the loser). But, first, we need to go back to a very simple and 
essential game: the game of Nim. 

Definition 2. A position $P$ is entirely defined by the set of possible moves from 
it. We write $P = \{P_1, P_2, P_3, \ldots\}$, where the $P_i$ are the children of $P$. 

Definition 3. Let n denote the Nim-heap with n matchsticks^2. 

• 0 is a void heap. This is a terminal position, where no move is possible: 

0 = {}. 

• 1 is a heap of only one matchstick. The only possible move is to remove 
this matchstick, so 0 is the only option: 1 = {0}. 

• 2 is a heap of two matchsticks: it is possible to remove one or the two 
matchsticks, and 2 = {0; 1}. 

• The general rule, for any positive number n, is n = {0; 1; ...; n - 1}. 
The importance of Nim comes from the following result: 

Theorem 1. (of Sprague- Grundy) For any impartial combinatorial game, all the 
positions are indistinguishable from some Nim-heap, called the nimber of the posi- 
tion. 

In the case of Sprouts, $S_2 \sim_+ 0$ and ABCD.}AB.}CD.}]! $\sim_+$ 3, which means that 
the nimber of $S_2$ is 0 and the one of ABCD.}AB.}CD.}]! is 3. 

Indistinguishability is an equivalence relation, and the corresponding equivalence 
classes are called indistinguishability classes. The Sprague-Grundy theorem then 
states that, in the normal version of impartial games, there are very few and par- 
ticularly simple indistinguishability classes. This greatly simplifies the analysis of 
games where positions appear as the sum of smaller ones. 

Unfortunately, this theorem does not apply to the misere version. There are 
many more indistinguishability classes, and John Conway proves in [3] that instead 
of the nimber, we need the concept of reduced canonical tree to analyze the classes 
of the misere version. We describe it in the following sections. 

^2 Children should never play with matchsticks or any other source of fire. 

2.3. Game tree. We call game tree obtained from a position $P$ the tree where the 
vertices are all the positions that can be reached by playing moves from $P$, and 
where two positions $P_1$ and $P_2$ are linked by an edge if $P_2$ is obtained from $P_1$ 
in only one move. 

In order to construct the game tree $A$ obtained from a position $P$, we need of 
course to know the rules of the game, so that we can compute the children of $P$, 
but it is important to note that it is then possible to study directly the game tree 
$A$, without referring anymore to the underlying position, or even the underlying 
game. 

2.4. Canonical trees. When two branches of a game tree are perfectly identical, 
it means that the player can choose between two moves leading exactly to the same 
situation. Choosing one move or the other will not make any difference in the game, 
so redundant branches in a game tree are useless. We call canonical tree the game 
tree where all redundant branches have been deleted. 


Figure 2. Game tree obtained from a given position and corre- 
sponding canonical tree 

Figure 2 shows on the left a game tree obtained from a given Sprouts position. 
The two branches on the right lead to similar games, so we can merge them into 
a single branch to obtain the canonical tree (on the right). The canonical tree, as 
well as the game tree, is of height 2, because the longest possible game ends in 2 
moves. 

The canonical tree corresponding to a given game tree $A$ can be defined 
recursively: 

• compute the canonical tree of each child of $A$. 

• in these canonical children, delete all the redundant ones. 
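
The two steps above can be sketched in Python. This is only an illustrative sketch, not the authors' program: a canonical tree is modelled as a frozenset of canonical children, so that identical branches merge automatically. The `moves` callback and the toy game `nim_moves` (Nim heaps stored as a sorted tuple) are assumptions introduced for the example.

```python
from functools import lru_cache


def canonical_tree(position, moves):
    """Return the canonical tree of `position` as nested frozensets.

    `moves(position)` is assumed to return hashable child positions.
    Storing the canonical children in a frozenset removes redundant branches.
    """
    @lru_cache(maxsize=None)
    def build(pos):
        return frozenset(build(child) for child in moves(pos))

    return build(position)


def height(tree):
    """Number of moves of the longest game playable from `tree`."""
    return 0 if not tree else 1 + max(height(child) for child in tree)


def nim_moves(heaps):
    """Hypothetical toy game: `heaps` is a sorted tuple of Nim-heap sizes."""
    children = []
    for i, size in enumerate(heaps):
        rest = heaps[:i] + heaps[i + 1:]
        for new_size in range(size):
            kept = rest + ((new_size,) if new_size else ())
            children.append(tuple(sorted(kept)))
    return children


# Two heaps of one matchstick: both moves lead to the same situation,
# so the canonical tree has a single branch.
tree = canonical_tree((1, 1), nim_moves)
```

Here the game tree has two identical branches, and the canonical tree keeps only one of them while the height stays equal to 2.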




Figure 3. Canonization of a game tree 



This notion of canonical tree allows us to keep only the necessary and sufficient 
information of a game tree needed to describe a position: if two positions have the 
same canonical tree, then they are indistinguishable whatever the version of the 
game is (normal, misere, or any other rule where the result depends only on the 
number of played moves). 

Figure 3 shows the canonization of the game tree resulting from the Sprouts 
position 0.AB.}AB.}]! (this position comes from $S_2$, after linking a spot to itself), 
obtained with our program. 

2.5. Reversible moves. 

2.5.1. Reversible moves. 

Definition 4. Let $G = \{G_1, G_2, \ldots, G_n\}$ be a non-empty canonical tree. Then, we 
say that a canonical tree $H$ is obtained from $G$ by adding reversible moves if 
$H = \{G_1, G_2, \ldots, G_n, R_1, R_2, \ldots\}$ and each tree $R_j$ has $G$ in its children (the moves 
$R_j$ are said to be reversible). 

$G$ and $H$ have the same outcome: if $G$ is a Win, it means one of its children $G_i$ 
is a Loss. Since $G_i$ is also a child of $H$, it implies that $H$ is a Win. Conversely, if 
$G$ is a Loss, it means all children $G_i$ are Wins. Moreover, each $R_j$ is a Win, since it 
has $G$ as a child, so all children of $H$ are Wins, and $H$ is a Loss. 

But in fact, there is a stronger result, which states that $G$ and $H$ are 
indistinguishable in the misere version. Indeed, if a player has a winning strategy for the 
sum $G + T$, then he can win $H + T$ by playing the same moves, and extending 
the strategy to the following case: 

• if the opponent plays one of the moves $R_j$, he should answer by playing the 
move to $G$ in this component (he "reverses" the move $R_j$, hence the name 
reversible move). 



Figure 4 shows an example of reversible move: $H$ is the tree on the left, and 
$G$ the one on the right. There is one reversible move, from which two moves are 
possible, and one of them is $G$. 

2.5.2. Particular case of the empty tree. If $G$ is the empty tree, the above definition 
holds, but we need an additional clause to ensure that $G$ and $H$ are 
indistinguishable in the misere version: $H$ must be a Win in the misere version. 

The reason is that when the two players play the game $H + T$ as described 
above, a particular case arises when $T$ is finished before any move is played in $H$. 
If $G$ is not empty, the strategy for playing $H$ is described above. 

But if $G$ is empty, the game on $G + T$ is supposed to be finished, and the player 
to play should be the winner. However, he finds himself forced to play one of the 
reversible moves $R_j$, and of course, he can win in this case only if $H$ is a Win in 
the misere version. The clause (also known as the proviso) ensures that $G$ and $H$ 
have the same outcome even in this particular case. 




Figure 4. Example of reversible move 
2.6. Reduced canonical trees. 

2.6.1. Reducers. The previous section shows that if $H$ is obtained from $G$ by 
adding reversible moves, then $G$ and $H$ are indistinguishable in the misere 
version, but what interests us is the reverse way: given a tree $H$, we try to simplify it 
by reducing it to a tree $G \subset H$, obtained by pruning reversible moves. This 
kind of tree $G$ will be called a reducer of $H$. 

Let us remark that $G$ is not necessarily unique, which seems at first to be a 
problem when there are several possible reducers. However, the reversible moves 
$R_1, R_2, R_3, \ldots$ must all include $G$ strictly, so their height is at least the height of 
$G$ plus 1. It follows that: 

• A reducer is always of the form: {children of $H$ with a height less than a 
given value}. 

• Two reducers are always comparable for the inclusion. If there are two dif- 
ferent reducers, the smallest one is included in the biggest one, and moreover 
it is a reducer of the biggest one. 

• There exists a (unique) smallest reducer. 

2.6.2. Reduced canonical trees. We can then define the reduced canonical tree of a 
canonical tree, by pruning all the possible reversible moves. This reduced canonical 
tree is constructed recursively: 

• compute the reduced canonical tree of each child. 

• among these reduced canonical children, delete the redundant ones. 

• reduce the resulting tree with the smallest possible reducer. 
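
The three steps above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: trees are nested frozensets, and the helper names `height`, `misere_win`, `reduce_tree`, `rct` and `heap` are hypothetical. Candidate reducers are tried from the smallest to the biggest, using the fact that a reducer is the set of children below a given height, and the proviso is applied when the candidate is the empty tree.

```python
def height(tree):
    """Number of moves of the longest game playable from `tree`."""
    return 0 if not tree else 1 + max(height(c) for c in tree)


def misere_win(tree):
    """Misere outcome: on the empty tree, the player to move wins."""
    return not tree or any(not misere_win(c) for c in tree)


def reduce_tree(children):
    """Reduce a set of already reduced children with the smallest reducer."""
    tree = frozenset(children)
    for k in sorted({height(c) for c in tree}):
        reducer = frozenset(c for c in tree if height(c) < k)
        reversible = tree - reducer
        # Every pruned move must have the reducer among its children.
        if not all(reducer in move for move in reversible):
            continue
        # Proviso: the empty reducer is allowed only for a misere Win.
        if reducer or misere_win(tree):
            return reducer
    return tree


def rct(tree):
    """Reduced canonical tree of a game tree given as nested frozensets."""
    return reduce_tree(rct(child) for child in tree)


def heap(n):
    """Tree of the Nim-heap with n matchsticks."""
    return frozenset(heap(k) for k in range(n))


two_in_braces = frozenset({heap(2)})                  # the tree {2}
example = frozenset({heap(0), frozenset({heap(1), two_in_braces})})
```

With these definitions, `reduce_tree(example)` gives back `heap(1)`, in agreement with the reduction of {0; {1; {2}}} to 1 discussed for S3, while `reduce_tree({heap(2)})` leaves {2} unchanged because the proviso fails.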

2.7. Indistinguishability in misere version. The main interest of these reduced 
canonical trees is that if two positions have the same reduced canonical tree, then 
they are indistinguishable in misere version. It is natural to ask for the converse. 

The converse is true for combinatorial games in normal version: we know that if 
two positions have a different nimber, then they are distinguishable. Indeed, if 
$P_1$ and $P_2$ have different nimbers, then $P_1$ distinguishes them, because $P_1 + P_1$ is a 
Loss, while $P_2 + P_1$ is a Win. 

In the case of a misere game, a result proved in [3] (p. 149) seems at first to 
answer the question: given two different reduced canonical trees $G$ and $H$, there 
exists a reduced canonical tree $T$ such that $G + T$ and $H + T$ have different 
misere outcomes, which means that $T$ distinguishes $G$ and $H$. 

From the above result, we could conclude that in the misere version, indistin- 
guishability classes are exactly the reduced canonical trees. However, when study- 
ing a game in particular, it is possible that some positions with different reduced 
canonical trees are in fact indistinguishable. Let us give an example. 

We consider the misere game of Nim, restricted to heaps of size ≤ 2. Since 
1 + 1 ~ 0 in the misere version, every position is indistinguishable from one of 
the form n × 2 or n × 2 + 1 (n ≥ 0). 
The possible moves from this kind of positions are as follows: 

• from n × 2 (n ≥ 1), we can move to (n - 1) × 2 + 1 or (n - 1) × 2. 

• from n × 2 + 1 (n ≥ 1), we can move to n × 2, (n - 1) × 2 + 1 or (n - 1) × 2. 

This enables us to determine recursively that the only losing positions are 1 and 
2n × 2 (n ≥ 1), and then that the only indistinguishability classes are 0; 1; 2; 
2 + 1; 2 + 2; 2 + 2 + 1. Indeed, when a position includes at least 3 times 2, 
deleting a pair of 2 does not change the outcome. 
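
The recursive determination above can be checked with a few lines of Python. This is only an illustrative sketch: a position is encoded, as an assumption of the example, by the pair (number of heaps of size 2, number of heaps of size 1), and `misere_win` is a hypothetical helper name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def misere_win(twos, ones):
    """True if the player to move wins misere Nim with heaps of size <= 2."""
    if twos == 0 and ones == 0:
        return True  # no move left: the opponent made the last move and loses
    children = []
    if ones:
        children.append((twos, ones - 1))      # remove a heap of size 1
    if twos:
        children.append((twos - 1, ones + 1))  # take one matchstick from a 2
        children.append((twos - 1, ones))      # take a whole heap of size 2
    return any(not misere_win(*child) for child in children)


# Losing positions among n x 2 and n x 2 + 1, for n = 0..6.
losing = [(t, o) for t in range(7) for o in range(2) if not misere_win(t, o)]
```

The list `losing` contains only the position 1 and the positions made of an even, non-zero number of heaps of size 2, as stated above.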
It implies that even if the reduced canonical trees of 2 and 2 + 2 + 2 are different, 
they are indistinguishable in this game^3, since no reduced canonical tree appearing 
in the game distinguishes them. Actually, this simplification can happen as soon as 
some of the possible reduced canonical trees do not appear in a given game. The new 
indistinguishability classes, bigger and less numerous, are the root of Thane Plambeck's 
work (see for example [7]) on misere quotients. 

Unfortunately, this theory seems difficult to apply to Sprouts, where a lot of 
different reduced canonical trees appear. Trying to find less numerous indistin- 
guishability classes may still be an option for Sprouts, but we restrained our work 
to the analysis of reduced canonical trees. 

Going back to our example, 2 and 2 + 2 + 2 are on the contrary distinguishable 
within Sprouts: the position ABCD.}ABEF.}CDFE.}]!, whose reduced canonical tree 
is {1; {2}}, distinguishes them, since 2 + {1; {2}} is a Win, while 2 + 2 + 2 + {1; {2}} 
is a Loss. 

2.8. Count. It is interesting to count the canonical trees and the reduced canonical 
trees, in order to evaluate their practical interest. 

First, we can count the exact number of canonical trees of a given height, as on 
figure 5, which shows the 16 canonical trees of height ≤ 3. 




Figure 5. Canonical trees of height ≤ 3 



Proposition 1. There are $2^{2^{\cdot^{\cdot^{\cdot^{2^{0}}}}}}$ (with h+1 times the number 2) canonical trees 
of height ≤ h. 

This can be proved by induction. Let $C_h$ denote the set of all canonical trees 
of height ≤ h and $c_h$ its cardinal number. The above formula can then be written 
$c_{h+1} = 2^{c_h}$, with $c_0 = 1$. This comes from the fact that a canonical tree of height ≤ h+1 can 
be defined as the set of canonical trees of its children, which are of height ≤ h. It 
means that there is a bijection between the set $C_{h+1}$ of all canonical trees of height 
≤ h+1 and $\mathcal{P}(C_h)$, the power set of $C_h$. 

It implies that the number of canonical trees of height ≤ n is, for increasing 
n: 1; 2; 4; 16; 65536; $2^{65536}$; ... which should be compared with the number of nimbers 
corresponding to trees of height ≤ n: 1; 2; 3; 4; 5; 6; ... 

Unfortunately, the number of reduced canonical trees is more similar to the 
first case: 1; 2; 3; 5; 22; 4171780; ... (see [5]). In fact, there are many reductions for 
small values, but they quickly become rare and we come back to a growth of the form 
$c_{n+1} = 2^{c_n}$. For example, $2^{22} = 4194304$ is very close to 4171780, and the next 
term is more than 99.99% of $2^{4171780}$ (the exact value is given in [3] p. 140). 
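
The bijection with the power set can be turned into a short enumeration. The following Python sketch is an illustration only (the function names are hypothetical): it builds all canonical trees of height ≤ h as nested frozensets by taking, at each step, every subset of the trees obtained so far.

```python
from itertools import combinations


def all_subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def canonical_trees_up_to(h):
    """All canonical trees of height <= h, as nested frozensets."""
    trees = {frozenset()}          # height <= 0: only the empty tree
    for _ in range(h):
        trees = set(all_subsets(trees))
    return trees


counts = [len(canonical_trees_up_to(h)) for h in range(4)]
```

The list `counts` reproduces the first terms 1; 2; 4; 16 of the sequence. The next term already requires enumerating 65536 trees, and the one after that is out of reach of any enumeration.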



^3 This example is detailed in [8]. 
Figure 6. Reduced canonical trees of height ≤ 3: 0; 1; 2; 3 and {2} 

2.9. Nim-heaps. The previous paragraph shows that reduced canonical trees ap- 
pearing in the misere version of Sprouts are much more numerous and complex than 
the nimbers of the normal version. However, almost-terminal positions frequently 
have a very simple reduced canonical tree, namely the tree of a Nim-heap. We 
detail in the following the properties explaining this fact. 

Definition 5. The Mex (minimum excluded) of a set of natural numbers is defined 
as the least natural number which is not included in the set. 

Example: Mex(0; 1; 4; 3; 1; 7) = 2. 

The canonical trees corresponding to Nim-heaps cannot be simplified with re- 
versible moves, but on the contrary, it is possible to simplify a position whose 
children are Nim-heaps (this result, as well as the following one, are proved in [3] 
p. 139): 

Theorem 2. A position whose children are all Nim-heaps is itself indistinguishable 
from a Nim-heap, except if all the children have size at least 2. Outside this 
exception, the position is indistinguishable from the Nim-heap whose size is the 
Mex of the sizes of the children. 

For example, $P_1$ = {0; 1; 3; 5} (a position whose children are Nim-heaps of size 
0; 1; 3; 5) is indistinguishable from a Nim-heap of size 2: $P_1$ = {0; 1; 3; 5} ~ 2. 

On the contrary, $P_2$ = {2; 3} (a position whose children are Nim-heaps of size 
2 and 3) cannot be reduced. 
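
As a small illustration of Definition 5 and Theorem 2, the Mex rule with its exception can be written as follows. The function names are hypothetical, and the sketch only handles positions whose children are already known to be Nim-heaps, given by their sizes.

```python
def mex(values):
    """Least natural number that is not in `values`."""
    present = set(values)
    n = 0
    while n in present:
        n += 1
    return n


def reduce_nim_children(sizes):
    """Size of the equivalent Nim-heap, or None when Theorem 2 does not apply.

    `sizes` are the sizes of the children, all assumed to be Nim-heaps.
    """
    sizes = set(sizes)
    if sizes and min(sizes) >= 2:
        return None  # exception: every child has size at least 2
    return mex(sizes)


p1 = reduce_nim_children([0, 1, 3, 5])
p2 = reduce_nim_children([2, 3])
```

Here `p1` is 2 and `p2` is `None`, matching the two examples above.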

Moreover, Nim-heaps have another interesting property, which means that it is 
sometimes possible to reduce a position constituted of a sum of Nim-heaps: 

Theorem 3. If at least one of the numbers m or n is equal to 0 or 1, then m + n ~ q, 
where q = m ⊕ n. 

"⊕" is the Nim-sum, which is done by a bitwise "exclusive or" on the two 
numbers. For example, 3 + 1 ~ 2, or 4 + 1 ~ 5. 

Reduction is not possible if m and n are ≥ 2. In that case, m + n is not 
indistinguishable from a given Nim-heap. For example, 2 + 2 = {2 + 0; 2 + 1} ~ {2; 3}, 
and it has already been described above as a position impossible to reduce. 
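
Theorem 3 suggests a simple way to shorten a sum of Nim-heaps in the misere version. The sketch below is an illustration under assumptions (the function name and the list encoding are hypothetical): heaps of size 1 cancel in pairs, and a remaining heap of size 1 is merged with one bigger heap by a bitwise exclusive or. Heaps of size at least 2 are never merged together.

```python
def simplify_heap_sum(sizes):
    """Return the sizes of an indistinguishable, shorter sum of Nim-heaps."""
    parity = sum(1 for s in sizes if s == 1) % 2   # 1 + 1 ~ 0
    big = sorted(s for s in sizes if s >= 2)       # heaps of size 0 vanish
    if parity and big:
        big[0] ^= 1        # Theorem 3 with n = 1: m + 1 ~ m XOR 1
    elif parity:
        big = [1]
    return sorted(big)


a = simplify_heap_sum([3, 1])
b = simplify_heap_sum([4, 1])
c = simplify_heap_sum([2, 2])
```

The results are `[2]`, `[5]` and `[2, 2]`: the first two sums collapse to a single heap, while 2 + 2 is left as it is.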

Let us give an example for Sprouts. We consider the game tree obtained from 
$S_3$. The positions contained in this game tree correspond to 55 different canonical 
trees, and only 2 of them are not reducible to a Nim-heap: 

• $S_2$ = 0.0.}]!, which has two children, both indistinguishable from 2. Its 
reduced canonical tree is then {2}. 

• 0.0.AB.}AB.}]!, which is "contaminated" by its child $S_2$. Its reduced 
canonical tree is {1; {2}}. 

The other 53 positions are all reducible to Nim-heaps, which shows the impor- 
tance of this concept when analyzing small positions of misere Sprouts. 
2.10. Recovery with reversible moves. We have seen in the previous paragraph 
that the position 0.0.AB.}AB.}]! is "contaminated" by one of its children. But 
when going back up the tree, it can happen that some positions are not, i.e. they 
are indistinguishable from a given Nim-heap, even though some positions of their 
subtree are not. 

For example, this is the case of $S_3$. It has three children, two of them reducible 
to 0, and the other being 0.0.AB.}AB.}]!. $S_3$ is then indistinguishable from 
{0; {1; {2}}}, but this is reducible to 1, because it can be obtained from 1 by 
adding the reversible move {1; {2}}. It means that $S_3$ is indistinguishable from a 
Nim-heap, even though this is not the case for one of its children. 




Figure 7. Reduced canonical tree obtained from S3 

This property of "recovery" with reversible moves increases the number of 
Nim-heaps in almost-terminal positions. Of course, it also applies to reduced canonical 
trees more complicated than Nim-heaps: because of reversible moves, the reduced 
canonical tree of a position can effectively be less complicated than the reduced 
canonical trees of its children. For example, this is the case for the tree of figure 4, 
corresponding to the Sprouts position 0.0.A.}1A.}]!. 

2.11. Factoring by 1. Some reduced canonical trees can be written in the form 
$G$ + 1. It becomes interesting when we consider sums of this kind of tree, because 
we can use the property 1 + 1 ~ 0 to reduce the size of the trees. 

For example, by using that 3 ~ 2 + 1 and that {3; {2}} ~ 1 + {2}, we obtain: 
3 + {3; {2}} ~ 2 + 1 + 1 + {2} ~ 2 + {2}. It means that the sum of two trees of 
height 3 and 4 has been reduced to the sum of two trees of height 2 and 3. 

Some other sums make it possible to reduce the size of the trees. Conway notes for example 
that {0; {2}; {3; {2}}} + 2 ~ {2} in [3] p. 151. But such sums are much too rare 
to be useful and in our program we used only: 1 + 1 ~ 0. 

3. Computation of reduced canonical trees 
3.1. Representation and storage. 

3.1.1. String representation. The intuitive way of representing RCTs^4 is to 
recursively define strings, in which each RCT is represented by the set of its children. 

For example, the RCT of $S_4$ would be represented by^5: {3; {1; 2; {3; {2}}}}. 
But this string representation quickly becomes inefficient when the RCTs grow. If 
the same RCT occurs in several places of a bigger one, its representation is stored 
as many times as it occurs, whereas it is clear that it would suffice to store this 
information only once. This defect is even more important when we store many 
RCTs in the database. 

^4 In the remainder of this article, reduced canonical tree is abbreviated RCT. 
^5 This is a compact form. With a string representation even for Nim-heaps, we should replace 
0 with {}, 1 with {{}}, 2 with {{}; {{}}} and 3 with {{}; {{}}; {{}; {{}}}}. 

3.1.2. Link representation. To avoid the problem of redundancy in string represen- 
tations, we implemented a representation by link, where each RCT has an identifier 
(a number). An RCT is still represented by the set of its children, but this time, 
we only store the set of their identifiers instead of their complete representation. 
This allows us to store only once each RCT in the database, while we can still refer 
to them several times by using their identifiers. 

But during computations using the RCTs, we frequently need some information 
about a given RCT: its height, and its outcome. Of course, it is possible to compute 
them recursively, but the search of identifiers in the databases would cost much 
running time. 

Consequently, we choose a representation that contains the main information 
needed about an RCT, and the identifier is made of three parameters: 

• the height of the RCT. 

• a number to distinguish RCTs of same height. 

• the character "W" or "L" according to the misere outcome of the RCT. 

By convention, the number is 0 if the RCT is a Nim-heap. Otherwise, the number 
1 is given to the first RCT of a given height that we meet, 2 to the second one, and so on. 

The character that describes the outcome of the RCT is useful during the 
reduction process: a tree is reducible to 0 only if it is a Win in misere version (cf. 
paragraph 3.2.2). 

Let us give an example with the RCT of figure 8, which arises from the Sprouts 
position 1ABC.}BCDE.}ADE.}]!. The string representation of this RCT is: 
{0;2;{3};{1;3;{2}}}. 




Figure 8. Reduced canonical tree of height 5 

In the following table, for each subtree of this RCT, we give its identifier as well 
as the set of identifiers of its children. 



RCT                      identifier    set of children

0                        0-0-W
1                        1-0-L         0-0-W
2                        2-0-W         0-0-W 1-0-L
3                        3-0-W         0-0-W 1-0-L 2-0-W
{2}                      3-1-L         2-0-W
{3}                      4-1-L         3-0-W
{1;3;{2}}                4-2-W         1-0-L 3-0-W 3-1-L
{0;2;{3};{1;3;{2}}}      5-1-W         0-0-W 2-0-W 4-1-L 4-2-W
The RCTs met during the computation are stored in a database similar to the 
last two columns of this table, as (RCT ; set of children of the RCT) couples. 
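
A database of this kind can be sketched as follows. This is a hypothetical Python illustration, not the authors' data structure: the class name `RctDatabase` and its methods are assumptions, and the caller is assumed to pass only sets of children that already form an RCT. The identifier is built from the height, a number (0 for Nim-heaps, otherwise the order of appearance at that height) and the misere outcome.

```python
class RctDatabase:
    """Hypothetical store: identifier 'height-number-outcome' -> child identifiers."""

    def __init__(self):
        self.children = {}      # identifier -> frozenset of child identifiers
        self.by_children = {}   # frozenset of child identifiers -> identifier
        self.last_number = {}   # height -> last number given to a non-heap

    @staticmethod
    def height(identifier):
        return int(identifier.split("-")[0])

    @staticmethod
    def heap_id(n):
        # Among Nim-heaps, only the heap 1 is a misere Loss.
        return f"{n}-0-{'L' if n == 1 else 'W'}"

    def intern(self, child_ids):
        """Return the identifier of the RCT with these children, creating it if new."""
        key = frozenset(child_ids)
        if key in self.by_children:
            return self.by_children[key]
        h = 1 + max(map(self.height, key)) if key else 0
        # Misere Win if there is no move, or if some child is a Loss.
        outcome = "W" if not key or any(c.endswith("L") for c in key) else "L"
        if key == {self.heap_id(k) for k in range(h)}:
            number = 0                                   # Nim-heap
        else:
            number = self.last_number.get(h, 0) + 1
            self.last_number[h] = number
        identifier = f"{h}-{number}-{outcome}"
        self.children[identifier] = key
        self.by_children[key] = identifier
        return identifier


db = RctDatabase()
h0 = db.intern([])
h1 = db.intern([h0])
h2 = db.intern([h0, h1])
h3 = db.intern([h0, h1, h2])
a = db.intern([h2])                # the RCT {2}
b = db.intern([h3])                # the RCT {3}
c = db.intern([h1, h3, a])         # the RCT {1;3;{2}}
root = db.intern([h0, h2, b, c])   # the RCT {0;2;{3};{1;3;{2}}}
```

Interning the RCTs in this order gives back the identifiers of the table, and `root` is `5-1-W`. A different order of insertion could give different numbers to RCTs of the same height.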

3.1.3. Dependence on the order of computation. However, this link representation 
presents a disadvantage: the number in the identifier depends only on the order 
in which the RCTs have been met, so the same RCT can have different identifiers 
in different computations. As a consequence, RCT databases produced in different 
computations are incompatible. 

We explain in paragraph 4.4 that in our computations, we produced an RCT 
database once and for all to circumvent this problem. 

3.2. Computation of the reduced canonical tree obtained from a given 
position. The definition of paragraph 2.6 immediately provides a recursive 
algorithm to compute the RCT of a given position: 

• compute (recursively) the RCT of each child of the position. 

• delete the duplicates amongst these RCTs. 

• reduce the obtained tree, using the smallest possible reducer. 

There is no difficulty in the first two steps, but the programming of the reduction 
should be detailed. 

3.2.1. Reduction in the case of Nim-heaps. The first reduction to be considered is 
the one corresponding to Theorem 2: if all the RCTs obtained from the children 
of the position are Nim-heaps, and if at least one is 0 or 1, then the RCT of the 
position is itself a Nim-heap that we can determine with the Mex rule. 

3.2.2. Reduction to 0. Next, we test whether the RCT of the position is reducible 
to 0. We have seen in paragraph 2.5.2 that it is possible only if the position is a 
Win in misere version. 

We must therefore start by checking if a child is a Loss in misere version, which is 
immediate, because the results are stored in the identifiers (without this, we should 
determine the outcome of the position through a computation on the entire tree, 
which would be much more expensive in running time). 

Then, we check if every child is a reversible move, i.e. if it has 0 as a child. 

3.2.3. Other reducers. If none of the previous reductions has worked, we have to 
see if it is possible to find another reducer. So, imagine that we are computing 
the RCT of a position, and that after having removed the duplicates amongst the 
RCTs of its children, we have obtained the set {$A_1$; $A_2$; $A_3$; ...}. Let $h_i$ denote the 
height of $A_i$. We sort the RCTs $A_i$ so that $(h_i)$ is an increasing sequence. Then 
the observation of paragraph 2.6.1 implies that it suffices to test the reducers of the 
form {$A_1$; ...; $A_i$} where $h_{i+1} \geq h_i + 2$. 

Coming back to the example of paragraph 3.1.2, let us imagine that we want 
to reduce: {0-0-W 2-0-W 4-1-L 4-2-W 5-0-W 7-0-W}. We then only need to test 
the following potential reducers: 

• {0-0-W} 

• {0-0-W 2-0-W} 

• {0-0-W 2-0-W 4-1-L 4-2-W 5-0-W} 

To be sure of finding the smallest possible reducer, our algorithm examines first 
the smallest potential reducers. 
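
The selection of potential reducers by height can be sketched as follows. This Python fragment is only an illustration (the function names are hypothetical): it reads the height from the first field of each identifier and keeps the prefixes of the sorted list that are followed by a gap of at least 2 in height. The empty reducer is not listed here, since the reduction to 0 is tested separately.

```python
def height(identifier):
    """Height stored in the first field of an identifier such as '4-2-W'."""
    return int(identifier.split("-")[0])


def potential_reducers(child_ids):
    """Prefixes {A_1; ...; A_i} of the sorted children with h_{i+1} >= h_i + 2."""
    ordered = sorted(child_ids, key=height)
    candidates = []
    for i in range(len(ordered) - 1):
        if height(ordered[i + 1]) >= height(ordered[i]) + 2:
            candidates.append(ordered[: i + 1])
    return candidates  # smallest candidates first


children = ["0-0-W", "2-0-W", "4-1-L", "4-2-W", "5-0-W", "7-0-W"]
candidates = potential_reducers(children)
```

On this example, `candidates` contains exactly the three potential reducers listed above, from the smallest to the biggest.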






JULIEN LEMOINE - SIMON VIENNOT 



{0-0-W}=1-0-L is not a correct reducer, because even if 1-0-L is a child of 
2-0-W, it is not a child of 4-1-L. 

Then, our algorithm examines the potential reducer {0-0-W 2-0-W}. It starts 
by looking in the database for the identifier of the RCT that has these two children, 
but it does not find it. For a good reason: {0-0-W 2-0-W} is reducible to 1-0-L 
(see paragraph 3.2.1). Since the smallest possible reducer must be an RCT, and since 
this RCT must be a child of at least one child of the position, the recursive nature 
of the algorithm implies that this RCT has already been stored in the database. 
So, when our algorithm cannot find a potential reducer in the database, it means 
that this potential reducer is itself reducible, and there is no need to try it. 

It remains to test the potential reducer {0-0-W 2-0-W 4-1-L 4-2-W 5-0-W}, 
which is not suitable, since the only children of 7-0-W are Nim-heaps. So it is 
impossible to reduce {0-0-W 2-0-W 4-1-L 4-2-W 5-0-W 7-0-W}. Since it is a new 
RCT, our program generates a new identifier of the form 8-n-W (this new RCT is 
a Win in misere version because the child 4-1-L is a Loss). 

3.3. Factoring by 1. We have seen in paragraph 2.11 that it is useful to determine 
which RCTs could be written as a sum of another RCT and of the Nim-heap 1. We 
present here the implementation of this factorization. The factorization step must 
take place just after the various reductions in the recursive algorithm. 

3.3.1. Representation of a position factorizable by 1. If two RCTs $G$ and $H$ 
satisfy $G \sim H + 1$, then the difference between their heights is 1. In order to 
store this relation in our database, we express the identifier of the biggest RCT as 
a function of the smallest one, with the following notation. 

We know that 3 ~ 2 + 1 (or, symmetrically, that 2 ~ 3 + 1). The smallest of these 
two trees is 2. Consequently, 2 keeps the same identifier, 2-0-W, while the identifier 
of 3 will be 2-0+1-W. 

We have also seen that {3; {2}} ~ 1 + {2}. As the identifier of {2} is 3-1-L, 
the identifier of {3; {2}} will be 3-1+1-W (the outcome character "W" corresponds 
to 3-1+1, not to 3-1). 

3.3.2. Detection of a position factorizable by 1. We come back to the example of
paragraph 3.1.2. Using the new identifier described in the previous paragraph,
we get: 4-2-W={0-0+1-L 2-0+1-W 3-1-L}. Now, let us compute the children of
4-2-W+1. To get a child, we play a move either in 4-2-W or in 1. So we have:
4-2-W+1={0-0-W 2-0-W 3-1+1-W 4-2-W}.

In this example, we can observe a more general result: if G is an RCT that
cannot be factorized by 1, then G + 1 = {(child of G)+1 ; G}. So, to determine
whether the set of the RCTs of the children of a position corresponds to a position
factorizable by 1, it suffices to check whether this set is of the above form.
In particular, G is the only RCT of maximal height in this set that cannot be factorized by 1.
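
The detection test can be sketched as follows. This is an illustration only, not the code of our program: identifiers are written without their outcome letter, the database is assumed to be a dictionary mapping each stored RCT to the set of its children, and the names add_one, height and factor_by_one are hypothetical.

```python
def add_one(rct: str) -> str:
    """Identifier of rct + 1, using 1 + 1 ~ 0 to cancel an existing factor."""
    return rct[:-2] if rct.endswith("+1") else rct + "+1"


def height(rct: str) -> int:
    base = int(rct.split("-")[0])
    return base + 1 if rct.endswith("+1") else base


def factor_by_one(children: set, database: dict):
    """Return G if `children` is the set of children of G + 1, else None."""
    if not children:
        return None
    top = max(height(c) for c in children)
    # G must be the only child of maximal height without a "+1" factor.
    candidates = [c for c in children
                  if height(c) == top and not c.endswith("+1")]
    if len(candidates) != 1 or candidates[0] not in database:
        return None
    g = candidates[0]
    expected = {add_one(c) for c in database[g]} | {g}
    return g if children == expected else None


# Assumed database content, taken from the examples of this section.
database = {
    "0-0": set(),
    "2-0": {"0-0", "0-0+1"},
    "3-1": {"2-0"},
    "4-2": {"0-0+1", "2-0+1", "3-1"},
}
print(factor_by_one({"0-0", "2-0", "3-1+1", "4-2"}, database))
```

On the children of 4-2 itself the function returns None, in agreement with the fact that 4-2 is stored without a +1 factor.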

4. Misere computation algorithm using RCTs 

When we compute the outcome of a position with our program, we explore the 
game tree obtained from this position in order to determine the outcome of the root. 
Details on how we carry out this exploration are available in [6]. What changes
compared to the normal version is the nature of the nodes of the explored tree. 

4.1. Simplification of the positions with RCTs. We detail the nature of these
nodes on an example. Consider the Sprouts position:

0.0.0.0.0.0.0.0.}]22.}]2ab2ba.}]0.0.A.}2A.}]!

This position is made of 4 independent positions, more or less complex:

• 0.0.0.0.0.0.0.0.}] is too big to compute its RCT. 

• 22.}] ~ 1.

• 2ab2ba.}] ~ 3 ~ 1 + 2 ~ 1+2-0-W.

• 0.0.A.}2A.}] ~ 1+3-1-L.

Finally, using 1 + 1 + 1 ~ 1, we obtain that this position is indistinguishable from
0.0.0.0.0.0.0.0.}]! +1+{2-0-W+3-1-L}.

Broadly speaking, a node of the game tree is composed of three parts: 

• the position part, which includes one or more independent positions, too 
big to compute their RCT. 

• the 0/1 part, which is 0 or 1 according to the parity of the number of 1s.

• the RCT part, which contains a list of RCTs. 
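
The three parts above can be assembled as in the following sketch. It only illustrates the idea under stated assumptions: the names Node and make_node are hypothetical, identifiers are written without their outcome letter, and known_rcts stands for a lookup table built from the RCT database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    positions: tuple  # Sprouts positions whose RCT is not known
    bit: int          # the 0/1 part
    rcts: tuple       # RCT identifiers, without "+1" factor


def make_node(lands, known_rcts):
    """Split independent positions into the three parts of a node."""
    positions, rcts, bit = [], [], 0
    for land in lands:
        rct = known_rcts.get(land)
        if rct is None:
            positions.append(land)   # too big: kept as a Sprouts position
            continue
        if rct.endswith("+1"):
            bit ^= 1                 # 1 + 1 ~ 0: only the parity matters
            rct = rct[:-2]
        if rct != "0-0":             # the empty game adds nothing to a sum
            rcts.append(rct)
    return Node(tuple(sorted(positions)), bit, tuple(sorted(rcts)))


# The example of this paragraph, with the RCTs given in the text.
known_rcts = {"22.}]": "0-0+1", "2ab2ba.}]": "2-0+1", "0.0.A.}2A.}]": "3-1+1"}
lands = ["0.0.0.0.0.0.0.0.}]", "22.}]", "2ab2ba.}]", "0.0.A.}2A.}]"]
print(make_node(lands, known_rcts))
```

The result has one remaining position, a 0/1 part equal to 1, and the RCT list (2-0, 3-1), which matches the node written above.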

4.2. Computation of the children of a node. The children of a node are computed
by taking into account 3 types of possible moves. A move is made in one of
the parts of the node, leaving the other two unaltered:

• a normal Sprouts move in the position part.

• a move in the 0/1 part, which consists in replacing 1 by 0.

• a move in the RCT part, which consists in replacing an RCT by one of its
children.
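
The three types of moves can be sketched as below. This is a hedged illustration, not our implementation: a node is a plain tuple (positions, bit, rcts), sprouts_children is a hypothetical callable returning, for each move in a position, the list of independent positions obtained, and rct_children is an assumed dictionary giving the children of each RCT. Folding the +1 factor of a child into the 0/1 part is a normalization choice of this sketch.

```python
def node_children(node, sprouts_children, rct_children):
    """Return the set of children of a node (positions, bit, rcts)."""
    positions, bit, rcts = node
    result = set()
    # 1. A normal Sprouts move in one position of the position part.
    for i, pos in enumerate(positions):
        rest = positions[:i] + positions[i + 1:]
        for new_lands in sprouts_children(pos):
            result.add((tuple(sorted(rest + tuple(new_lands))), bit, rcts))
    # 2. A move in the 0/1 part: replace 1 by 0.
    if bit == 1:
        result.add((positions, 0, rcts))
    # 3. A move in the RCT part: replace an RCT by one of its children.
    for i, rct in enumerate(rcts):
        rest = rcts[:i] + rcts[i + 1:]
        for child in rct_children[rct]:
            new_bit = bit
            if child.endswith("+1"):      # assumed normalization step
                new_bit ^= 1
                child = child[:-2]
            new_rcts = rest if child == "0-0" else rest + (child,)
            result.add((positions, new_bit, tuple(sorted(new_rcts))))
    return result


rct_children = {"2-0": {"0-0", "0-0+1"}, "3-1": {"2-0"}}
start = ((), 1, ("2-0", "3-1"))
for child in sorted(node_children(start, lambda pos: [], rct_children)):
    print(child)
```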

4.3. Interest of RCTs. Replacing independent positions by their RCT when they 
are known has several advantages. Many almost-terminal positions have the same 
RCT, so replacing them by their RCT can simplify the game tree, by reducing the 
number of nodes stored and explored (and therefore, it reduces memory consump- 
tion and improves the running time). 

For bigger positions, we found that RCTs provide relatively few simplifications. 
First, it is rather rare that complex Sprouts positions have the same RCT, and 
second, our canonization of Sprouts positions is already quite good: there are very 
few duplicates amongst the computed children of a given position, so the RCT 
of this position has more or less the same number of children as this position. 
But even so, the RCT proves its usefulness, because computing the children of a 
Sprouts position is complex and expensive in running time, and so the database of
(RCT ; set of children of the RCT) pairs provides a cache that avoids repeating
the same computation of children several times.

4.4. Criterion to replace a position by its RCT. Unfortunately, computing 
the RCT of a position requires the exploration of its whole game tree. This limits 
the size of the positions that can be replaced by their RCT. Indeed, the size of the 
game tree obtained from a position increases rapidly with the size of the position. 
It can be seen in the following table where we counted the number of canonical 
trees obtained from different starting positions of Sprouts. 

            starting spots    number of canonical trees
            2                 10
            3                 55
            4                 713
            5                 10461
            6                 150147

Figure 9. Number of different canonical trees in game trees obtained
from a starting position

We considered two different criteria to decide when to compute the RCT of a
position. The first is to compute the RCT of all positions below a certain limit
number of lives. This method has the advantage of adapting to the current computation,
because we compute only the RCTs of positions actually encountered. But it has
the disadvantage of changing the RCT databases during the computation. Above
all, this criterion is not well adapted to the structure of Sprouts: two positions
with the same number of lives can lead to trees of completely different complexity.
For example, there are 55 different canonical trees in the game tree obtained
from S_3 (i.e. 9 lives), whereas there are 478 in the tree resulting from the position
1abcde2edcba.2.}]! (which also has 9 lives).

We have therefore chosen another criterion. The starting positions with p spots, 
and their descendants, quickly appear in the computation of positions with a higher 
number of spots. We therefore computed the RCT of S_6. Once this RCT is computed,
we do not compute any other, and we thus have a fixed RCT database.

For example, from S_12^-, we play the move 0.0.0.0.0.0.0.0.AB.}0.0.0.AB.}]!,
and then the move 0.0.0.0.0.0.0.0.}]0.0.A.}0.A.}]!. At this point, our algorithm
modifies the node, because we can read in the database obtained from S_6
that 0.0.A.}0.A.}]! ~ 3-1+1-W. On the other hand, the rest of the position does not
appear in this database. Thus the new node is 0.0.0.0.0.0.0.0.}]! +1+3-1-L.

4.5. Sums of positions. To simplify the nodes, we could consider merging the 
0/1 part and the list of RCTs in a single RCT, by computing their sum. In the 
case of paragraph 4.1, we would replace 1+{2-0-W+3-1-L} by a single RCT of
height 6, 5-1+1-W. The interest of this approach is twofold: in addition to writing
simpler nodes, it would detect the simplifications described at the end of an earlier
paragraph.
However, this method is not practical, because the computation of the sum of 
RCTs requires the storage of too many additional RCTs. For example, after the
storage of the RCT of S_6, let us suppose that we want to compute the outcome of the
position 0.0.0.0.0.}]0.0.0.0.}]!. The RCTs of these two independent positions
are known (and their respective heights are 13 and 6).

The algorithm described above would replace each of these two positions by
its RCT, and thus start the computation on the node: ! +0+{13-7-W+6-208-L}.
Then, the classical Win/Loss algorithm we used allows us to find the outcome of this
position by storing only 10 losing positions, while in contrast, the computation of
the RCT of 13-7-W+6-208-L needs to store more than 35,000 new RCTs, i.e. more
than for the computation of the RCT of S_6.
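
The difference between the two approaches can be illustrated on small trees. The sketch below is not the Win/Loss algorithm of our program (which explores branches selectively and stores only losing positions); it is a plain memoized search, with hypothetical names, that computes the misere outcome of a sum directly from its components without ever building the tree of the sum. Trees are represented as frozensets of their children.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def nim_heap(n):
    """Game tree of the Nim-heap n, as a frozenset of its children."""
    return frozenset(nim_heap(k) for k in range(n))


def canonical(components):
    """Drop finished components and fix an order to help memoization."""
    return tuple(sorted((t for t in components if t), key=hash))


@lru_cache(maxsize=None)
def misere_win(components):
    """True if the player to move wins the misere sum of `components`."""
    moved = False
    for i, tree in enumerate(components):
        for child in tree:
            moved = True
            rest = components[:i] + (child,) + components[i + 1:]
            if not misere_win(canonical(rest)):
                return True      # a move to a Loss exists
    # No move at all: the opponent made the last move and loses.
    return not moved


tree_2 = frozenset({nim_heap(2)})            # the RCT {2}, i.e. 3-1-L
print(misere_win((nim_heap(1), tree_2)))     # outcome of 1 + {2}
```

On these small cases the outcomes agree with the identifiers used in this paper, for example 2-0-W, 3-1-L and 3-1+1-W.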

During a misere computation, we frequently need to consider such sums (and 
even more complex ones), so it is not possible to compute the resulting RCTs. 
Finally, the simplifications mentioned above do not compensate
for this problem: during the tests we conducted, we observed none.

5. Results

5.1. Phase of preliminary computations. The following table summarizes the
number of RCTs involved in game trees obtained from starting positions, and gives
an idea of the memory needed to take S_7 as the basis of our computations instead
of S_6 (we have not computed it due to memory limitations, but it may be within reach:
only 45 MB of RAM are needed to store S_6).



    starting spots    number of RCTs    number of positions stored
    2                 5                 18
    3                 7                 157
    4                 35                1796
    5                 1204              24784
    6                 25459             393103



These numbers of RCTs are an objective property of the Sprouts game, which should
be checkable by other programmers. In contrast, the number of positions encountered
in the game trees obtained from starting positions depends on our canonization
of Sprouts positions. This number is an indicator of the quality of our canonization.
As our canonization only uses simplifications that preserve the canonical
trees, it is instructive to compare this column with the table of Figure 9.

While many almost-terminal positions are indistinguishable from Nim-heaps, 
some quite complex positions are indistinguishable from Nim-heaps as well. For 
example, 0.0.0.0.2.}] ! ~ B. Conversely, the position with a minimum number of 
lives that is not a Nim-heap is ABC.}ABD.}CE.}DE.}]! ~ {2}.

5.2. Computation of S_17^-. The implementation of the techniques described in this
article allowed us to compute that S_17^- is a Win. The computation required the
storage of approximately 170,000 nodes. Of these nodes, about half had an empty
position part, which shows that we can explore the game tree so that the position is
dismantled rather quickly into several independent positions small enough for their
RCT to be in the game tree of S_6.

The exploration of the game tree was performed in the same way as
described in [6]: a depth-first algorithm, in which we track the exploration, and
manually choose to explore the branches that seem the most promising. We also used
a check algorithm: when a computation is finished, this algorithm only keeps the
positions needed to demonstrate the outcome. It significantly reduces the sizes of
the databases: for example, amongst the 170,000 positions computed, fewer than
18,000 are necessary to demonstrate the outcome of S_17^-.

The computation time was only twenty hours on a 1.8 GHz processor, and RAM
consumption was less than 100 MB, so we can reasonably expect to improve this record.
Figure 10 shows the number of positions needed to demonstrate S_p^- (after having
used a check algorithm). This number gives a good idea of the complexity of S_p^-,
and thus of the difficulty of computing higher values of p.



    p                   7   8   9   10   11   12   13    14    15    16    17
    Outcome of S_p^-    L   L   L   W    W    W    L     L     L     W     W
    Positions           7   24  44  114  79   983  1082  3517  6906  8981  17583

Figure 10. Number of positions needed to demonstrate S_p^-

It is worthwhile to be aware that this number is the number of positions we need
in addition to the game tree of S_6 to compute the outcome of S_p^-. Of course, we
cannot find the outcome of S_7^- with only 7 positions...

The programming methods described in this article will probably help to determine
the outcome of S_18^- or S_19^- fast enough, but it is unlikely that the study of
the misere game will give results equivalent to those of the normal version. Indeed, because
of theoretical difficulties related to the misere version, many more nodes must be 
explored to demonstrate a result. Also, unlike in the case of the normal Sprouts game,
in which the complexity of S_p does not increase strictly with p (e.g. it is faster to
compute Syj than S^ 5 , and we were able to compute S^ 7 but not S 33 ), the difficulty 
seems to increase more regularly for the misere version. 

6. Conclusion 

The theory of reduced canonical trees and the resulting algorithm for misere
impartial games were particularly effective in the case of Sprouts, by making efficient
use of the decompositions into sums of independent positions. The same algorithm could
be used for computing other impartial games in the misere version, provided that such
decompositions into sums of independent positions occur in these games.

The potential improvements may relate to various areas: in addition to purely
theoretical improvements, and of course to improvements due to the increasing
performance of computers, we may hope for improvements in the exploration of
game trees.

The program we used for computation is available with its source code on our 
web site http://sprouts.tuxfamily.org/ under a GNU licence, together with 
several databases used during our computations. 